    Towards Keypoint Guided Self-Supervised Depth Estimation

    This paper proposes using keypoints as a self-supervision cue for learning depth map estimation from a collection of input images. Because ground-truth depth for real images is difficult to obtain, many unsupervised and self-supervised approaches to depth estimation have been proposed. Most of these approaches use depth map and ego-motion estimates to reproject pixels from the current image into an adjacent image from the collection. The depth and ego-motion estimates are then evaluated based on pixel intensity differences between the corresponding original and reprojected pixels. Instead of reprojecting individual pixels, we propose to first select keypoints in both images and then reproject and compare the corresponding keypoints of the two images. The keypoints should describe the distinctive image features well. By training a deep model with and without the keypoint extraction technique, we show that using keypoints improves depth estimation learning. We also propose some future directions for keypoint-guided learning of structure-from-motion problems.
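
The reprojection step described above can be illustrated with a minimal sketch. It assumes a pinhole camera with intrinsics `(fx, fy, cx, cy)`, an ego-motion estimate given as a 3x3 rotation and a translation vector, and a mean Euclidean pixel distance as the keypoint comparison. The function names and the choice of distance are hypothetical and are not taken from the paper.

```python
import math


def reproject_keypoint(uv, depth, intrinsics, rotation, translation):
    """Map a keypoint from the current image into the adjacent image.

    Assumes a pinhole camera; all names here are illustrative.
    """
    fx, fy, cx, cy = intrinsics
    u, v = uv
    # Back-project the pixel to a 3D point using the estimated depth.
    point = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Apply the estimated ego-motion (rigid transform) to the 3D point.
    x, y, z = (
        sum(r * p for r, p in zip(row, point)) + t
        for row, t in zip(rotation, translation)
    )
    # Project the moved point into the adjacent camera.
    return (fx * x / z + cx, fy * y / z + cy)


def mean_keypoint_error(source_kps, depths, target_kps,
                        intrinsics, rotation, translation):
    """Mean pixel distance between reprojected and matched keypoints."""
    errors = []
    for uv, depth, target in zip(source_kps, depths, target_kps):
        pu, pv = reproject_keypoint(uv, depth, intrinsics, rotation, translation)
        errors.append(math.hypot(pu - target[0], pv - target[1]))
    return sum(errors) / len(errors)
```

In a learning setting, an error of this kind would be computed from the predicted depth and ego-motion and minimized during training. The sketch only shows the geometry, not the keypoint selection or the network.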

    On the Comparison of Classic and Deep Keypoint Detector and Descriptor Methods

    The purpose of this study is to compare the performance of several classic hand-crafted and deep keypoint detector and descriptor methods. In particular, we consider the following classical algorithms: SIFT, SURF, ORB, FAST, BRISK, MSER, HARRIS, KAZE, AKAZE, AGAST, GFTT, FREAK, BRIEF and RootSIFT, where a subset of all combinations is paired into detector-descriptor pipelines. Additionally, we analyze the performance of two recent and promising deep detector-descriptor models, LF-Net and SuperPoint. Our benchmark relies on the HPSequences dataset, which provides real and diverse images under various geometric and illumination changes. We analyze the performance on three evaluation tasks: keypoint verification, image matching and keypoint retrieval. The results show that certain classic and deep approaches are still comparable, with some classic detector-descriptor combinations outperforming pretrained deep models. In terms of the execution times of the tested implementations, the SuperPoint model is the fastest, followed by ORB. The source code is published at https://github.com/kristijanbartol/keypoint-algorithms-benchmark.
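
Of the descriptors listed above, RootSIFT differs from SIFT only by a post-processing step, which the following sketch shows: the descriptor is L1-normalized and then square-rooted element-wise. The function name and the `eps` guard are assumptions made for illustration. This is not the benchmark's code.

```python
import math


def root_sift(descriptor, eps=1e-7):
    """Turn a non-negative SIFT descriptor into a RootSIFT descriptor."""
    # L1-normalize; eps (an illustrative choice) avoids division by zero.
    l1_norm = sum(abs(x) for x in descriptor) + eps
    # Element-wise square root of the normalized values.
    return [math.sqrt(abs(x) / l1_norm) for x in descriptor]
```

After this transform the descriptor has approximately unit L2 norm, so comparing RootSIFT descriptors with Euclidean distance corresponds to comparing the original SIFT descriptors with the Hellinger kernel.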

    A Deep Learning Model for Estimating Human Body Measurements from Images

    Understanding body measurements within and between populations is important and has many applications in medicine, surveying, the fashion industry, fitness, and entertainment. Recent advances in human body measurement and shape estimation have been significantly driven by statistical models and deep learning, enabling methods that estimate 3D human meshes from 3D point clouds and 2D images, so-called mesh regression methods. This thesis builds upon the state-of-the-art approaches to mesh regression from multiple images. The first step is to propose the simplest method and use it as a baseline. The baseline is a linear regression model that takes only a person's self-estimated height and weight and estimates the corresponding mesh. The baseline performs surprisingly well compared to the state-of-the-art methods. The second contribution is a model for 3D human pose estimation from multiple camera views. The novelty of the model is that it can take any set of camera views as input, regardless of their relative arrangement and the number of cameras. The third contribution is a model for estimating the parameters of human pose, shape, and clothing from a single image. The estimated parameters are interpretable and, thus, controllable, which is a significant advantage over previous approaches and important for many anthropometric applications. The three proposed models are evaluated in detail and compared to the state-of-the-art methods.

    Short-term energy-generating product price forecasting

    Crude oil price forecasting is important due to its strong influence on rising and falling price trends across the entire global market. This work describes the problem of short-term crude oil price forecasting and presents a possible solution that uses an ensemble of regression trees optimized by additive gradient descent. The model is evaluated with various regression metrics and compared with a random forest model on a real-world problem. The results are compared with existing results from relevant papers on crude oil price forecasting.
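
The idea of an ensemble of regression trees built additively can be sketched as follows. The sketch reads the described optimization as gradient boosting under squared error, which is an assumption. It uses depth-1 trees (stumps) on a single scalar feature to stay short, and all function names and default parameters are hypothetical rather than taken from the work.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: one threshold, two leaf means."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep the right leaf non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all feature values equal: fall back to a constant leaf
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.1):
    """Add stumps one at a time, each fit to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the negative gradient of the squared error.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)
```

For short-term forecasting, one simple setup would use the previous day's price as `xs` and the next day's price as `ys`. A practical model would use multi-feature, deeper trees and a held-out evaluation period.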

    Unsupervised learning of depth estimation with monocular vision

    Reconstructing the structure of natural scenes is a very important ingredient of many practical computer vision applications. Approaches that learn from image sequences captured by a single camera are particularly interesting because they do not require complicated systems for acquiring and labeling training data. Recently, methods based on deep convolutional models have achieved great success in this area. This work studies an unsupervised model for learning scene geometry from a monocular image sequence. The implementation of the deep model is described, and depth maps produced by the trained model are shown.

